Abstract

Facebook is flooded with diverse and heterogeneous content,
from kittens to music and news, by way of satirical and
funny stories. Each piece of that corpus reflects the
heterogeneity of the underlying social background. In the
Italian Facebook we have found an interesting case: a page
with more than 40 K followers that every day posts the same
picture of a popular Italian singer. In this work, we use
such a page as a control to study and model the relationship
between content heterogeneity and popularity. In particular,
we use that page for a comparative analysis of information
consumption patterns with respect to pages posting science
and conspiracy news. In total, we analyze about 2 M likes
and 190 K comments, made by approximately 340 K and 65 K
users, respectively. We conclude the paper by introducing a
model that mimics users’ selection preferences while
accounting for the heterogeneity of contents.

Introduction 

Online social networks such as Facebook foster the
aggregation of people around common interests, narratives,
and worldviews. Indeed, the World Wide Web caused a paradigm
shift in the production and consumption of content that
increased the volume and heterogeneity of what is available.
Users can express their attitudes by producing and consuming
heterogeneous information: for example, conspiracists avoid
mainstream news and follow their own information sources,
whereas debunkers try to inhibit the diffusion of false
claims. Images of kittens and pets, political memes, gossip,
and scandals all spread on Facebook. By liking, commenting
on, and sharing their preferred contents, users can express
their passions and emotions, and sarcasm is no exception.
Indeed, it is not rare to find pages promoting parodistic
and sarcastic imitations of online social dynamics, e.g.,
Ebola and Kittens [1] or In favor of chem-trails [2]. An
interesting case in the Italian Facebook is a page [3] with
more than 40 K followers that every day posts the exact same
picture of Toto Cutugno, a famous Italian pop singer.

In this work, we use the intriguing case of that page as a
baseline to study and model the effect of content diversity
on popularity. Specifically, we analyze user activity and
post consumption patterns on the baseline page over a
timespan of about 4 months. Through a comparative analysis
with two sets of pages producing heterogeneous contents, we
show that there are no remarkable differences in user
activity patterns, whereas significant dissimilarities
between post consumption patterns emerge. Such a comparative
analysis allows us to derive a model of information
consumption that accounts for the heterogeneity of contents.
We then show that the proposed model is able to reproduce
the phenomenon observed in the empirical data. In
particular, we show the effects of different levels of
content heterogeneity on post consumption patterns.

Background and Related Works 

A large body of literature addresses the study of social
dynamics on socio-technical systems, from social contagion
up to social reinforcement [4-22]. Among these, one of the
defining topics of computational social science is the
understanding of the driving forces behind the popularity of
contents [23]. Such a challenge has been addressed by
looking at the sentiment of comments, contents, or users’
attention [24-32].

However, the mechanisms behind popularity remain largely
unexplored [33-35]. In [36] the authors address such a
challenge experimentally by measuring the impact of content
quality and social influence on the eventual popularity or
success of cultural artifacts. The effects of specific
contents on the formation of communities of interest, their
permeability to false information, and their resistance to
change have been recently characterized in [37-40]. In
particular, in [41] the authors point out that connectivity
patterns of the Facebook social network are prominently
driven by homophily of users (i.e., the tendency of
individuals to associate with similar others) towards
specific kinds of contents. Microblogging platforms such as
Facebook and Twitter [42] have lowered the cost of
information production and broadcasting, boosting the
potential reach of each idea or meme [43, 44], i.e., content
or concepts that spread rapidly on the Web. Still, the
abundance of information to which we are exposed through
online social networks and other socio-technical systems is
exceeding our capacity to consume it [45]. As a result, the
dynamics of information is
driven more than ever before by the economy of attention
[46-48]. We address this challenge by studying the interlink
between content diversity and popularity. More specifically,
we investigate the effects of sources that always produce
the same information on users’ activity, consumption, and
attention patterns.

Data Description 

In this work, we aim at investigating the role of content
diversity in the dynamics of information consumption in
online social networks. To this end, we use a set of
Facebook pages promoting heterogeneous contents and a
Facebook page that always promotes the same picture. The set
of pages promoting heterogeneous contents is composed of 73
public Facebook pages, of which 34 are about scientific news
and 39 about conspiratorial news; we refer to the former as
science pages and to the latter as conspiracy pages. The
page promoting homogeneous contents is called "La stessa
foto di Toto Cutugno ogni giorno" ("The same photo of Toto
Cutugno every day"; Toto Cutugno is a well-known Italian pop
singer-songwriter). By publishing the same picture of the
Italian singer every day, and nothing else, this page
represents the perfect control for studying content
diversity; we refer to it as the baseline page. Starting
from these pages, we downloaded all the posts, collected all
the likes and comments on the posts, and counted the number
of shares. Data related to science and conspiracy pages were
collected from August 22, 2013 to December 31, 2013, whereas
data related to the baseline page were collected from August
22, 2014 (the creation date of the page) to December 31,
2014. In total, we collected around 2 M likes and 190 K
comments, made by about 340 K and 65 K users, respectively.
Table 1 summarizes the details of our data collection.
Likes, shares, and comments have different meanings from the
user’s viewpoint. Most of the time, a like stands for
positive feedback on the post; a share expresses the will to
increase the visibility of a given piece of information; and
a comment is the way in which online collective debates take
form. Comments may contain negative or positive feedback
with respect to the post.


Results and Discussion 

In this section, we first present the statistical signatures
characterizing users’ activity on pages with diversified
content on specific topics (science and conspiracy news)
against the case of the page posting the same picture every
day (baseline). Then we derive a model of information
consumption mimicking user preferences with respect to
contents.



              Total       Science   Conspiracy   Baseline
Pages         74          34        39           1
Posts         49,354      13,028    36,169       157
Likes         2,095,677   614,078   1,184,084    297,515
Comments      192,967     40,608    138,138      14,221
Shares        3,782,480   477,457   3,297,687    7,336
Likers        344,367     162,146   159,524      22,697
Commenters    64,903      18,358    41,666       4,875

Table 1: Dataset breakdown. The number of pages, posts,
likes, comments, shares, likers, and commenters for science
pages, conspiracy pages, and the baseline page.
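As a quick reading aid for Table 1, the per-post averages can be derived directly from the reported counts. The snippet below is only an illustrative sketch: the dictionary layout and the helper name `per_post` are assumptions introduced here, while the numbers are copied from the table.

```python
# Counts copied from Table 1; the dictionary layout is an illustrative choice.
table1 = {
    "Science":    {"posts": 13_028, "likes": 614_078,   "comments": 40_608},
    "Conspiracy": {"posts": 36_169, "likes": 1_184_084, "comments": 138_138},
    "Baseline":   {"posts": 157,    "likes": 297_515,   "comments": 14_221},
}


def per_post(category: str, action: str) -> float:
    """Average number of `action` events (likes or comments) per post."""
    row = table1[category]
    return row[action] / row["posts"]


for name in table1:
    likes = per_post(name, "likes")
    comments = per_post(name, "comments")
    print(f"{name:<11} likes/post = {likes:8.1f}   comments/post = {comments:6.1f}")
```

With these counts, the baseline page averages 1,895 likes per post, against roughly 47 for science pages and 33 for conspiracy pages, so the single repeated picture attracts far more likes per post than the heterogeneous pages do.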

Content and Users Activity 

Let us focus on some regularities concerning users’ activity
on science pages and conspiracy pages compared with the
baseline page. Figure 1 shows the probability density
function (PDF) for the normalized (footnote 1) number of
likes for each user. We find that the activity of users
presents a heavy-tailed distribution.



Figure 1: Users’ activity patterns. Probability density
function (PDF) for the normalized number of likes by each
user.
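The unity-based normalization applied to the like counts can be sketched as a linear rescaling to the range [0,1]. This is a minimal illustration; the function name `unity_normalize`, the handling of the degenerate case where all values coincide, and the sample counts are assumptions made here, not details given in the text.

```python
def unity_normalize(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Degenerate case (all values equal): return zeros by convention (assumption).
        return [0.0 for _ in values]
    span = hi - lo
    return [(v - lo) / span for v in values]


# Made-up like counts for five users, for illustration only.
likes_per_user = [1, 3, 3, 10, 41]
normalized = unity_normalize(likes_per_user)
print(normalized)
```

Normalizing each set of pages separately makes the three PDFs comparable on the same axis even though the raw like counts differ by orders of magnitude.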

In Figure 2 we show the PDF of the users’ lifetime in terms
of their liking activity, i.e., the temporal interval
between the first and the last like of the user on a given
page. We find a slight difference in the lifetime of the
baseline users with respect to science and conspiracy users.
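The lifetime defined above (time between a user's first and last like on a page) could be computed along the following lines. This is a hedged sketch: the input layout, a list of `(user_id, timestamp)` pairs, and the name `user_lifetimes` are hypothetical, and the choice of days as the unit is an assumption.

```python
from datetime import datetime


def user_lifetimes(likes):
    """Map each user to the time, in days, between their first and last like.

    `likes` is an iterable of (user_id, timestamp) pairs for a single page.
    """
    first, last = {}, {}
    for user, ts in likes:
        if user not in first or ts < first[user]:
            first[user] = ts
        if user not in last or ts > last[user]:
            last[user] = ts
    return {u: (last[u] - first[u]).total_seconds() / 86400 for u in first}


# Toy records, for illustration only.
toy_likes = [
    ("u1", datetime(2014, 8, 22)),
    ("u1", datetime(2014, 9, 1)),
    ("u2", datetime(2014, 10, 5)),
]
lifetimes = user_lifetimes(toy_likes)
print(lifetimes)
```

A user who liked only once has a lifetime of zero under this definition.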

Figure 2: Users’ lifetime. Probability density function
(PDF) of the users’ lifetime in terms of their liking
activity. The PDF shows a slight difference in the lifetime
of the baseline users with respect to science and conspiracy
users.

These figures show that users’ activity patterns are similar
and present heavy-tailed distributions despite the different
nature of the contents, and we cannot find any significant
difference between the users’ interaction patterns induced
by heterogeneous or homogeneous contents.

1 We performed a unity-based normalization to bring all
values into the range [0,1].

Conversely, by analyzing consumption patterns related to
posts, we find a significant difference in the information
consumption dynamics. Figure 3 shows the PDF for the number
of likes received by posts belonging to science pages,
conspiracy pages, and the baseline page. The number of likes
received by posts follows a heavy-tailed distribution if the
posts belong to pages promoting heterogeneous contents
(science and conspiracy pages), whereas it is approximately
Gaussian if the posts belong to a page promoting homogeneous
content (baseline page).
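One simple way to tell a roughly Gaussian sample of likes per post from a heavy-tailed one is to look at its sample skewness, which is close to zero for symmetric data and large for right-skewed data. The sketch below is not the authors' procedure: it uses synthetic samples (the Gaussian mean and spread and the Pareto shape are arbitrary choices made here), and the helper `sample_skewness` is a hypothetical name.

```python
import random
import statistics


def sample_skewness(xs):
    """Third standardized moment of the sample (population form)."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    return sum(((x - mean) / std) ** 3 for x in xs) / len(xs)


rng = random.Random(0)
# Synthetic stand-ins, NOT the paper's data: a symmetric and a right-skewed sample.
gaussian_like = [rng.gauss(1900, 150) for _ in range(5000)]
heavy_tailed = [rng.paretovariate(1.5) for _ in range(5000)]

print("skewness, Gaussian-like sample:", sample_skewness(gaussian_like))
print("skewness, heavy-tailed sample: ", sample_skewness(heavy_tailed))
```

Skewness is only a coarse summary; a visual check of the PDF on logarithmic axes, as in Figure 3, remains the more informative diagnostic.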

In summary, users’ activity always presents a heavy-tailed
distribution, which results in heavy-tailed consumption
patterns on posts when the contents are heterogeneous.
However, when the content promoted by a page is homogeneous
(i.e., always the same), the heavy-tailed users’ activity
results in post consumption patterns that are approximately
Gaussian.

Modeling Contents Consumption 

Here we introduce a model of content consumption that
exploits the properties of the Beta distribution to generate
different levels of posts’ attractiveness, thus varying
content heterogeneity in the simulated collection of posts.

Figure 3: Posts’ consumption patterns. Probability density
function (PDF) for the normalized number of likes received
by posts belonging to science pages, conspiracy pages, and
the baseline page. The PDFs show remarkable differences
between the consumption patterns of pages promoting
heterogeneous contents and those of the page promoting
homogeneous contents.

The Beta distribution is a family of continuous probability
distributions defined on the interval [0,1] and
characterized by two real parameters, $\alpha > 0$ and
$\beta > 0$, which control the shape of the distribution. In
particular, for $\alpha = 1$ and $\beta = 1$ the Beta
distribution $\mathrm{Be}(\alpha, \beta)$ is equivalent to
the Uniform distribution $U(0,1)$. Conversely, if
$\alpha = 1$ and $\beta > 20$, the Beta distribution
$\mathrm{Be}(\alpha, \beta)$ is a right heavy-tailed
distribution. Figure 4 shows the Beta probability density
function with respect to the two shape parameters $\alpha$
and $\beta$.
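For reference, the standard density of the Beta distribution and its special case $\alpha = 1$ can be written out explicitly. These are textbook identities rather than formulas stated in the text; the symbols $v$ (a draw from the distribution) and $t$ (a threshold) are introduced here for illustration.

```latex
% Density of Be(\alpha, \beta) on [0,1]; B(\alpha, \beta) is the Beta function.
f(x;\alpha,\beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)},
\qquad
B(\alpha,\beta) = \frac{\Gamma(\alpha)\,\Gamma(\beta)}{\Gamma(\alpha+\beta)}

% Special case \alpha = 1, where B(1,\beta) = 1/\beta:
f(x;1,\beta) = \beta\,(1-x)^{\beta-1},
\qquad
\Pr(v > t) = (1-t)^{\beta},
\qquad
\mathbb{E}[v] = \frac{1}{1+\beta}

% \beta = 1 gives f(x;1,1) = 1, i.e. the uniform density U(0,1).
```

As $\beta$ grows, the mass concentrates near zero and the mean $1/(1+\beta)$ shrinks, so only a few draws are comparatively large.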

Figure 4: Beta distribution $\mathrm{Be}(\alpha, \beta)$.
Two parameters, $\alpha$ and $\beta$, control the shape of
the distribution. In particular, for $\alpha = 1$ and
$\beta = 1$ the Beta distribution
$\mathrm{Be}(\alpha, \beta)$ is equivalent to the Uniform
distribution $U(0,1)$. Conversely, if $\alpha = 1$ and
$\beta > 20$, the Beta distribution
$\mathrm{Be}(\alpha, \beta)$ is a right heavy-tailed
distribution.

In our model, each post has a value $v$ drawn from a Beta
distribution, $v \sim \mathrm{Be}(1, \beta)$, with $\beta$
ranging between 1 and 1,000,000, indicating its
attractiveness. We let the parameter $\beta$ assume those
extreme values in order to obtain different distributions
for posts’ attractiveness. Indeed, notice that when
$\beta = 1$ the Beta distribution $\mathrm{Be}(1, \beta)$ is
equivalent to a uniform distribution $U(0,1)$, so that we
have a collection of homogeneous-content posts, i.e., each
post has the same degree of attractiveness; whereas when
$\beta \to \infty$ the Beta distribution
$\mathrm{Be}(1, \beta)$ is equivalent to a right
heavy-tailed distribution, so that we have a collection of
heterogeneous-content posts, i.e., there are few posts with
a high level of attractiveness, while the vast majority of
the posts is characterized by a low level of attractiveness.
Moreover, each user is characterized by two parameters
randomly drawn from power-law distributions: her volume of
activity, $a \sim p(x)$, and her fixed preference about the
posts, $b \sim p(x)$, where $p(x) = x^{-\gamma}$ with
$\gamma = 1.5$. Each user cannot exceed her assigned volume
of activity, $a$, and she likes a given post if and only if
her normalized (footnote 2) fixed preference, $b$, is
smaller than the attractiveness, $v$, of that post. Note
that in our model
we do not take into account the users’ network: since the
Facebook network is very dense (indeed, the diameter of the
Facebook social network is 3.74 [49]), the connections
between users are not likely to influence posts’ consumption
dynamics.

We run simulations for $\beta$ ranging between 1 and
1,000,000, with $P = 10{,}000$ posts and $U = 20{,}000$
users. Results are averaged over 100 iterations.
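The model described above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the lower cutoff of the power law (x >= 1), the truncation of the activity volume to an integer capped at the number of posts, the random order in which each user scans the posts, and all function and variable names are choices made here. The sizes are also much smaller than the reported P = 10,000 and U = 20,000, and no averaging over iterations is performed.

```python
import random


def unity_normalize(values):
    """Rescale values linearly into [0, 1] (zeros if all values coincide)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def simulate(beta, n_posts=200, n_users=500, gamma=1.5, seed=0):
    """Return (likes_per_post, likes_per_user, activity) for one run."""
    rng = random.Random(seed)
    # Post attractiveness v ~ Be(1, beta).
    v = [rng.betavariate(1.0, beta) for _ in range(n_posts)]
    # p(x) proportional to x^(-gamma) on x >= 1 is a Pareto law with shape
    # gamma - 1; the cutoff x >= 1 is an assumption of this sketch.
    shape = gamma - 1.0
    # Activity volume a: truncated to an integer and capped at n_posts (assumption).
    activity = [min(n_posts, int(rng.paretovariate(shape))) for _ in range(n_users)]
    # Fixed preference b: power-law draw, then unity-based normalization.
    b = unity_normalize([rng.paretovariate(shape) for _ in range(n_users)])

    likes_per_post = [0] * n_posts
    likes_per_user = [0] * n_users
    order = list(range(n_posts))
    for u in range(n_users):
        rng.shuffle(order)  # assumption: each user scans posts in random order
        for p in order:
            if likes_per_user[u] >= activity[u]:
                break  # the user has exhausted her volume of activity
            if b[u] < v[p]:  # like iff preference is below attractiveness
                likes_per_post[p] += 1
                likes_per_user[u] += 1
    return likes_per_post, likes_per_user, activity


homogeneous = simulate(beta=1)
heterogeneous = simulate(beta=1_000_000)
print("total likes, beta = 1:        ", sum(homogeneous[0]))
print("total likes, beta = 1,000,000:", sum(heterogeneous[0]))
```

Plotting a histogram of `likes_per_post` for the two values of `beta` would be the natural next step to compare the simulated consumption patterns with the empirical ones.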

Figure 5 shows the probability density function (PDF) of the
users’ activity and of the post consumption patterns
generated by a simulation of the model with
$\beta = 1{,}000{,}000$, i.e., in the case of extremely
heterogeneous-content posts. Observe that users’ activity is
heavy-tailed, and the distribution of posts’ consumption is
skewed. Such a result is consistent with the empirical data
shown in the previous section: if the content promoted by a
page is heterogeneous, the heavy-tailed users’ activity
results in skewed post consumption patterns.

Figure 6 shows the probability density function (PDF) of the
users’ activity and of the post consumption patterns
generated by a simulation of the model with $\beta = 1$,
i.e., in the case of homogeneous-content posts. Notice that
users’ activity is heavy-tailed, whereas posts’ consumption
is approximately Gaussian. Such a result is consistent with
the empirical data shown in the previous section: if the
content promoted by a page is always the same, the
heavy-tailed users’ activity results in approximately
Gaussian post consumption patterns.

2 Note that we performed a unity-based normalization in
order to bring all values of $b \sim p(x) = x^{-1.5}$ into
the range [0,1], so that the fixed preference of the user is
comparable with the attractiveness of the posts.

Concluding Remarks 

Facebook is flooded with different and heterogeneous
contents, from the latest news to satirical and funny
stories. Each piece of that corpus reflects the
heterogeneity of the underlying social background. Indeed,
the World Wide Web caused a paradigm shift in the production
and consumption of information that increased the amount and
heterogeneity of contents available to users. On online
social networks such as Twitter and Facebook, people can
express their attitudes, passions, and emotions by producing
and consuming heterogeneous information.

In the Italian Facebook, we have found a fascinating case of
content homogeneity: a page with more than 40 K followers
that every day posts the same picture of Toto Cutugno, a
popular Italian singer. In this work, we use such a page as
a benchmark to investigate and model the effect of content
heterogeneity on popularity. In particular, we use that page
for a comparative analysis of information consumption
patterns with respect to pages posting heterogeneous
contents related to science and conspiracy.

We show that there are no remarkable differences in user
activity patterns, whereas we find significant
dissimilarities between the post consumption patterns of the
page promoting homogeneous contents and those of the pages
producing heterogeneous contents. Finally, we derive a model
of information consumption that accounts for the
heterogeneity of contents, and we show that the proposed
model is able to reproduce the phenomenon observed in the
empirical data.